Clustering via Mode Seeking by Direct Estimation of the Gradient of a Log-Density
نویسندگان
چکیده
Mean shift clustering finds the modes of the data probability density by identifying the zero points of the density gradient. Since it does not require to fix the number of clusters in advance, the mean shift has been a popular clustering algorithm in various application fields. A typical implementation of the mean shift is to first estimate the density by kernel density estimation and then compute its gradient. However, since a good density estimation does not necessarily imply an accurate estimation of the density gradient, such an indirect two-step approach is not reliable. In this paper, we propose a method to directly estimate the gradient of the log-density without going through density estimation. The proposed method gives the global solution analytically and thus is computationally efficient. We then develop a mean-shift-like fixed-point algorithm to find the modes of the density for clustering. As in the mean shift, one does not need to set the number of clusters in advance. We experimentally show that the proposed clustering method significantly outperforms the mean shift especially for high-dimensional data.
منابع مشابه
Least-Squares Log-Density Gradient Clustering for Riemannian Manifolds
Mean shift is a mode-seeking clustering algorithm that has been successfully used in a wide range of applications such as image segmentation and object tracking. To further improve the clustering performance, mean shift has been extended to various directions, including generalization to handle data on Riemannian manifolds and extension to directly estimating the log-density gradient without de...
متن کاملRegularized Multitask Learning for Multidimensional Log-Density Gradient Estimation
Log-density gradient estimation is a fundamental statistical problem and possesses various practical applications such as clustering and measuring nongaussianity. A naive two-step approach of first estimating the density and then taking its log gradient is unreliable because an accurate density estimate does not necessarily lead to an accurate log-density gradient estimate. To cope with this pr...
متن کاملKNN Kernel Shift Clustering with Highly Effective Memory Usage
This paper presents a novel clustering algorithm with highly effective memory usage. The algorithm, called kNN kernel shift, classifies samples based on underlying probability density function. In clustering algorithms based on density, a local mode of the density represents a cluster center. It is effective to shift each sample to a point having higher density, considering the density gradient...
متن کاملThe transition energy and the beaming angle of converted LO-mode waves from 100 to 400 kHz through density gradient according to observations of kilometric continuum radiations in the plasmapause
The satellite observations such as the Cluster mission with four-point measurements show some local fluctuations in the density gradient in the vicinity of the plasmapause. These structures are found over a broad range of spatial scales, with a size from 20 to 5000 km. Also, the simultaneous observations of the kilometric continuum by IMAGE (Imager for Magnetopause-to-Aurora Global Exploration)...
متن کاملEstimation of Total Organic Carbon from well logs and seismic sections via neural network and ant colony optimization approach: a case study from the Mansuri oil field, SW Iran
In this paper, 2D seismic data and petrophysical logs of the Pabdeh Formation from four wells of the Mansuri oil field are utilized. ΔLog R method was used to generate a continuous TOC log from petrophysical data. The calculated TOC values by ΔLog R method, used for a multi-attribute seismic analysis. In this study, seismic inversion was performed based on neural networks algorithm and the resu...
متن کامل